Globally Consistent Algorithms for Mixture of Experts

Authors

  • Ashok Vardhan Makkuva
  • Sreeram Kannan
  • Pramod Viswanath
Abstract

Mixture-of-Experts (MoE) is a widely popular neural network architecture and a basic building block of highly successful modern neural networks, for example, Gated Recurrent Units (GRU) and Attention networks. However, despite this empirical success, finding an efficient and provably consistent algorithm to learn the parameters has remained a long-standing open problem for more than two decades. In this paper, we introduce the first algorithm that learns the true parameters of an MoE model for a wide class of non-linearities with global consistency guarantees. Our algorithm relies on a novel combination of the EM algorithm and tensor method-of-moments techniques. We empirically validate our algorithm on both synthetic and real datasets in a variety of settings, and show superior performance over standard baselines.

Department of Electrical and Computer Engineering, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, email: [email protected]
Department of Electrical Engineering, University of Washington, Seattle, email: [email protected]
Department of Electrical and Computer Engineering, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, email: [email protected]
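To make the architecture concrete, below is a minimal sketch of an MoE regression model of the kind the abstract describes: a softmax gating network picks mixing weights over experts, and the prediction is the gated combination of the experts' non-linear outputs. This is an illustrative sketch, not the paper's algorithm; the function names and the choice of `tanh` as the expert non-linearity are assumptions for the example.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_predict(x, gate_W, expert_W, nonlinearity=np.tanh):
    """Mean prediction of a k-expert MoE: sum_i p_i(x) * g(a_i^T x).

    x:        (n, d) inputs
    gate_W:   (d, k) gating weights
    expert_W: (d, k) expert regressors, one column per expert
    """
    probs = softmax(x @ gate_W)               # (n, k) gating probabilities
    expert_out = nonlinearity(x @ expert_W)   # (n, k) per-expert outputs
    return (probs * expert_out).sum(axis=1)   # (n,) gated mixture prediction
```

The learning problem the paper addresses is recovering `gate_W` and `expert_W` from input-output samples; the difficulty is that the gating and expert parameters are coupled inside the likelihood, which is why plain EM lacks global guarantees.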


Similar resources

Mixture of Experts for Persian handwritten word recognition

This paper presents the results of Persian handwritten word recognition based on the Mixture of Experts (ME) technique. In the basic form of ME, the problem space is automatically divided into several subspaces for the experts, and the outputs of the experts are combined by a gating network. In our proposed model, we used Mixture of Experts Multi-Layered Perceptrons with a momentum term, in the classification ...


An Improved Mixture of Experts Approach for Model Partitioning in VLSI Design Using Genetic Algorithms

The partitioning of complex processor models on the gate and register-transfer level for parallel functional simulation based on the clock-cycle algorithm is considered. We introduce a hierarchical partitioning scheme combining various partitioning algorithms in the frame of a competing strategy. Melting together the different partitioning results within one level using superpositions we crossov...


Discriminative Density Propagation for Visual Tracking

We introduce BM3E, a Conditional Bayesian Mixture of Experts Markov Model, for consistent probabilistic estimates in discriminative visual tracking. The model applies to problems of temporal and uncertain inference and represents the unexplored bottom-up counterpart of pervasive generative models estimated with Kalman filtering or particle filtering. Instead of inverting a non-linear generative...


Investigation of Mixture of Experts Applied to Residential Premises Valuation

Several experiments were conducted in order to investigate the usefulness of the mixture of experts approach for an online internet system assisting in real estate appraisal. All experiments were performed using real-world datasets taken from a cadastral system. The analysis of the results was performed using statistical methodology including nonparametric tests followed by post-hoc procedures desig...


Mixture of Experts Classification Using a Hierarchical Mixture Model

A three-level hierarchical mixture model for classification is presented that models the following data generation process: (1) the data are generated by a finite number of sources (clusters), and (2) the generation mechanism of each source assumes the existence of individual internal class-labeled sources (subclusters of the external cluster). The model estimates the posterior probability of c...



Journal:
  • CoRR

Volume: abs/1802.07417

Publication year: 2018